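Efficient trajectory generation in complex dynamic environments remains an open problem in the unmanned surface vehicle (USV) domain. In this paper, a cooperative trajectory planning algorithm for a coupled USV-UAV system is proposed to ensure that the USV can follow a safe and smooth path while advancing autonomously through multi-obstacle maps. Specifically, the unmanned aerial vehicle (UAV) acts as a flying sensor and provides a real-time global map and obstacle information by means of a lightweight semantic segmentation network and a 3D projection transformation. An initial obstacle-avoidance trajectory is then generated by a graph-based search method. To account for the under-actuated kinematic characteristics of the USV, a numerical optimization method based on hull dynamic constraints is introduced so that the trajectory is easier to track during motion control. Finally, a motion control method based on nonlinear model predictive control (NMPC) with a minimum-energy-consumption constraint during execution is proposed. Experimental results verify the effectiveness of the whole system and show that the generated trajectories can be tracked by the USV with considerable accuracy.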
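The abstract does not say which graph-based search produces the initial obstacle-avoidance trajectory. As a minimal sketch, assuming the UAV map has been rasterized into an occupancy grid (1 = obstacle) and assuming A* with 4-connected unit-cost moves, the initial path could be found as follows. The function name `astar` and the grid encoding are illustrative and not taken from the paper.

```python
import heapq


def astar(grid, start, goal):
    """Shortest 4-connected path on an occupancy grid (1 = obstacle).

    Illustrative stand-in for the paper's unspecified graph-based search.
    Returns a list of (row, col) cells, or None if the goal is unreachable.
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance is admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    came_from = {start: None}
    best_cost = {start: 0}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Walk the parent links back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry, a cheaper route was found later
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    came_from[nxt] = cell
                    heapq.heappush(
                        open_heap, (new_cost + heuristic(nxt), new_cost, nxt)
                    )
    return None


# A wall across the middle row forces a detour through the right column.
demo_grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
demo_path = astar(demo_grid, (0, 0), (2, 0))
```

A path found this way is piecewise axis-aligned, which is why a later smoothing or optimization stage under hull dynamic constraints, as the abstract describes, would still be needed before tracking.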
translated by 谷歌翻译
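Unmanned surface vessels (USVs) are widely used in ocean exploration and environmental protection. To ensure that a USV can carry out its mission successfully, trajectory planning and motion tracking are the two most critical technologies. In this paper, we propose a novel trajectory generation and tracking method for USVs based on optimization theory. Specifically, the USV dynamic model is described using differential flatness, so that a trajectory can be generated by dynamic RRT* in a linear invariant system form under an optimal boundary value objective. To reduce the number of samples and improve efficiency, we adjust the trajectory through local optimization. Dynamic constraints are considered during optimization, so the generated trajectory conforms to the kinematic characteristics of the under-actuated hull and is easier to track. Finally, motion tracking is added using model predictive control formulated as a sequential quadratic programming problem. Experimental results show that the planned trajectory is more consistent with the kinematic characteristics of the USV and that tracking accuracy remains at a higher level.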
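Word embedding is a fundamental natural language processing task that learns feature representations of words. However, most word embedding methods assign only one vector to each word, even though polysemous words have multiple senses. To address this limitation, we propose the SememeWSD Synonym (SWSD) model, which assigns a different vector to every sense of a polysemous word with the help of word sense disambiguation (WSD) and the synonym sets in OpenHowNet. We use the SememeWSD model, an unsupervised word sense disambiguation model based on OpenHowNet, to disambiguate word senses and annotate each polysemous word with a sense ID. We then obtain the top 10 synonyms of the word sense from OpenHowNet and take the average vector of these synonyms as the vector of the word sense. In experiments, we evaluate the SWSD model on semantic similarity calculation using Gensim's wmdistance method, where it improves accuracy. We also test the SememeWSD model with different BERT models to find the more effective one.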
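The averaging step described above can be sketched in a few lines. This is only an illustration under stated assumptions: `synonyms` stands in for the ranked synonym list that would come from OpenHowNet, `embeddings` is a plain dictionary of pre-trained word vectors, and skipping out-of-vocabulary synonyms is a choice made here, not something the abstract specifies.

```python
def sense_vector(synonyms, embeddings, top_k=10):
    """Average the vectors of the first top_k synonyms of one word sense.

    synonyms:   ranked list of synonym strings (hypothetical input that
                would be obtained from OpenHowNet in the described setup).
    embeddings: dict mapping a word to its vector (list of floats).
    Synonyms missing from `embeddings` are skipped (an assumption).
    Returns None when no synonym has a vector.
    """
    vectors = [embeddings[w] for w in synonyms[:top_k] if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


toy_embeddings = {"bank": [1.0, 0.0], "shore": [3.0, 2.0]}
toy_sense = sense_vector(["bank", "shore", "unknown"], toy_embeddings)
```

Because each sense gets its own synonym list, two senses of the same surface word end up with different vectors even though the underlying embedding table has one vector per word.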
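Traffic forecasting is a canonical example of a spatial-temporal learning task in intelligent traffic systems. Existing approaches capture spatial dependency with a pre-determined matrix in graph convolution neural operators. However, an explicit graph structure loses some hidden representations of the relationships among nodes. Furthermore, traditional graph convolution neural operators cannot aggregate long-range nodes on the graph. To overcome these limitations, we propose a novel network, Spatial-Temporal Adaptive graph convolution with Attention Network (STAAN), for traffic forecasting. First, we adopt an adaptive dependency matrix instead of a pre-defined matrix during GCN processing to infer the inter-dependencies among nodes. Second, we integrate PW-attention, which is based on the graph attention network and designed for global dependency, with GCN as the spatial block. In addition, a stacked dilated 1D convolution, which is efficient for long-term prediction, is adopted in our temporal block to capture the different time series. We evaluate STAAN on two real-world datasets, and experiments validate that our model outperforms state-of-the-art baselines.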
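To illustrate why stacking dilated 1D convolutions helps with long horizons, the sketch below implements a plain (unpadded, single-channel) dilated convolution and stacks three layers. The kernel size, the dilation schedule (1, 2, 4), and the shared all-ones kernel are assumptions chosen for readability, not the configuration used in STAAN.

```python
def dilated_conv1d(x, weights, dilation):
    """Unpadded 1D convolution whose kernel taps are `dilation` steps apart.

    y[t] = sum_k weights[k] * x[t + k * dilation]
    The output is shorter than the input by (len(weights) - 1) * dilation.
    """
    span = (len(weights) - 1) * dilation
    return [
        sum(w * x[t + k * dilation] for k, w in enumerate(weights))
        for t in range(len(x) - span)
    ]


def stacked_dilated_conv1d(x, weights, dilations=(1, 2, 4)):
    """Apply the same kernel repeatedly with growing dilation (illustrative)."""
    for d in dilations:
        x = dilated_conv1d(x, weights, d)
    return x


series = list(range(10))
one_layer = dilated_conv1d(series, [1, 1], dilation=2)
three_layers = stacked_dilated_conv1d(series, [1, 1])
```

With a kernel of size 2 and dilations 1, 2, and 4, each output of the stack depends on 8 consecutive input steps, so the receptive field grows exponentially with depth while the number of weights grows only linearly.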
Knowledge graphs (KGs) have served as a key component of various natural language processing applications. Commonsense knowledge graphs (CKGs) are a special type of KG in which entities and relations are composed of free-form text. However, previous work on KG completion and CKG completion suffers from long-tail relations and newly added relations that have few known triples for training. In light of this, few-shot KG completion (FKGC), which draws on the strengths of graph representation learning and few-shot learning, has been proposed to address the problem of limited annotated data. In this paper, we comprehensively survey previous attempts at such tasks in the form of a series of methods and applications. Specifically, we first introduce FKGC challenges, commonly used KGs, and CKGs. We then systematically categorize and summarize existing works in terms of the type of KG and the methods used. Finally, we present applications of FKGC models to prediction tasks in different areas and share our thoughts on future research directions for FKGC.
Few Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes from only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After these steps, performance on novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarked on the COCO dataset in the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots, e.g., we boost nAP by a noticeable +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few Shot Object Detection. Code and model will be available.
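The abstract does not spell out how support masks yield class centers. One common way to do this in few-shot segmentation is masked average pooling, sketched below as an assumption rather than as RefT's actual module: the support feature map is averaged over the pixels where the support mask is 1, giving one vector per class. The nested-list feature layout and the function name are hypothetical.

```python
def masked_average_pool(features, mask):
    """Average feature vectors over the pixels selected by a binary mask.

    features[r][c] is a list of C channel values for pixel (r, c).
    mask[r][c] is 1 for pixels of the support object, else 0.
    Returns a C-dim class center; all zeros if the mask is empty.
    This is a generic technique assumed here for illustration.
    """
    channels = len(features[0][0])
    total = [0.0] * channels
    count = 0
    for feat_row, mask_row in zip(features, mask):
        for vec, selected in zip(feat_row, mask_row):
            if selected:
                count += 1
                for i, value in enumerate(vec):
                    total[i] += value
    if count == 0:
        return total
    return [t / count for t in total]


support_features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
]
support_mask = [
    [1, 0],
    [0, 1],
]
class_center = masked_average_pool(support_features, support_mask)
```

A center computed this way could then be used to re-weight query features channel by channel, which is one plausible reading of the "dynamic class centers" mentioned above.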
This paper focuses on designing efficient models with low parameters and FLOPs for dense predictions. Even though CNN-based lightweight methods have achieved stunning results after years of research, the trade-off between model accuracy and constrained resources still needs further improvement. This work rethinks the essential unity of the efficient Inverted Residual Block in MobileNetv2 and the effective Transformer in ViT, inductively abstracting a general concept of Meta-Mobile Block, and we argue that the specific instantiation is very important to model performance even when the same framework is shared. Motivated by this phenomenon, we deduce a simple yet efficient modern Inverted Residual Mobile Block (iRMB) for mobile applications, which absorbs CNN-like efficiency to model short-distance dependency and Transformer-like dynamic modeling capability to learn long-distance interactions. Furthermore, we design a ResNet-like 4-phase Efficient MOdel (EMO) based only on a series of iRMBs for dense applications. Massive experiments on ImageNet-1K, COCO2017, and ADE20K benchmarks demonstrate the superiority of our EMO over state-of-the-art methods, e.g., our EMO-1M/2M/5M achieve 71.5, 75.1, and 78.4 Top-1 that surpass SoTA CNN-/Transformer-based models, while trading off model accuracy and efficiency well.
Despite significant progress in object categorization in recent years, a number of important challenges remain, mainly the ability to learn from limited labeled data and to recognize object classes within a large, potentially open, set of labels. Zero-shot learning is one way of addressing these challenges, but it has only been shown to work with limited-size class vocabularies and typically requires separation between supervised and unsupervised classes, allowing the former to inform the latter but not vice versa. We propose the notion of vocabulary-informed learning to alleviate the above-mentioned challenges and address problems of supervised, zero-shot, generalized zero-shot, and open set recognition using a unified framework. Specifically, we propose a weighted maximum margin framework for semantic manifold-based recognition that incorporates distance constraints from (both supervised and unsupervised) vocabulary atoms. Distance constraints ensure that labeled samples are projected closer to their correct prototypes, in the embedding space, than to others. We illustrate that the resulting model shows improvements in supervised, zero-shot, generalized zero-shot, and large open set recognition, with up to 310K class vocabulary on the Animals with Attributes and ImageNet datasets.
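The distance constraint described above can be written as a hinge penalty: a sample is penalized whenever it is not closer to its correct prototype than to another prototype by at least some margin. The sketch below is a simplified illustration. It assumes Euclidean distance, a single unweighted margin, and a dictionary of prototypes, whereas the paper's weighted maximum margin formulation may differ in all three respects.

```python
import math


def margin_violation(embedded, prototypes, correct, margin=1.0):
    """Sum of hinge penalties for one embedded sample.

    For every prototype other than the correct one, the penalty is
    max(0, margin + d(sample, correct) - d(sample, other)).
    Euclidean distance and a uniform margin are assumptions made here.
    """
    d_correct = math.dist(embedded, prototypes[correct])
    return sum(
        max(0.0, margin + d_correct - math.dist(embedded, proto))
        for name, proto in prototypes.items()
        if name != correct
    )


toy_prototypes = {"cat": [0.0, 0.0], "dog": [3.0, 4.0]}
satisfied = margin_violation([0.0, 0.0], toy_prototypes, correct="cat")
violated = margin_violation([0.0, 0.0], toy_prototypes, correct="dog")
```

Because the sum runs over every entry in `prototypes`, vocabulary words with no labeled samples can still contribute constraints, which reflects the idea of letting unsupervised vocabulary atoms inform the embedding.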
In this paper, we investigate the joint device activity and data detection in massive machine-type communications (mMTC) with a one-phase non-coherent scheme, where data bits are embedded in the pilot sequences and the base station simultaneously detects active devices and their embedded data bits without explicit channel estimation. Due to the correlated sparsity pattern introduced by the non-coherent transmission scheme, the traditional approximate message passing (AMP) algorithm cannot achieve satisfactory performance. Therefore, we propose a deep learning (DL) modified AMP network (DL-mAMPnet) that enhances the detection performance by effectively exploiting the pilot activity correlation. The DL-mAMPnet is constructed by unfolding the AMP algorithm into a feedforward neural network, which combines the principled mathematical model of the AMP algorithm with the powerful learning capability, thereby benefiting from the advantages of both techniques. Trainable parameters are introduced in the DL-mAMPnet to approximate the correlated sparsity pattern and the large-scale fading coefficient. Moreover, a refinement module is designed to further advance the performance by utilizing the spatial feature caused by the correlated sparsity pattern. Simulation results demonstrate that the proposed DL-mAMPnet can significantly outperform traditional algorithms in terms of the symbol error rate performance.
Deploying reliable deep learning techniques in interdisciplinary applications requires learned models to output accurate and (even more importantly) explainable predictions. Existing approaches typically explicate network outputs in a post-hoc fashion, under an implicit assumption that faithful explanations come from accurate predictions/classifications. We make the opposite claim that explanations boost (or even determine) classification. That is, end-to-end learning of explanation factors to augment discriminative representation extraction could be a more intuitive strategy to inversely assure fine-grained explainability, e.g., in neuroimaging and neuroscience studies with high-dimensional data containing noisy, redundant, and task-irrelevant information. In this paper, we propose such an explainable geometric deep network, dubbed NeuroExplainer, with applications to uncover altered infant cortical development patterns associated with preterm birth. Given fundamental cortical attributes as network input, NeuroExplainer adopts a hierarchical attention-decoding framework to learn fine-grained attentions and respective discriminative representations to accurately distinguish preterm infants from term-born infants at term-equivalent age. NeuroExplainer learns the hierarchical attention-decoding modules under subject-level weak supervision coupled with targeted regularizers deduced from domain knowledge regarding brain development. These prior-guided constraints implicitly maximize the explainability metrics (i.e., fidelity, sparsity, and stability) in network training, driving the learned network to output detailed explanations and accurate classifications. Experimental results on the public dHCP benchmark suggest that NeuroExplainer led to quantitatively reliable explanation results that are qualitatively consistent with representative neuroimaging studies.